Web-based Affiliation Matching

Authors

  • David Aumüller
  • Erhard Rahm
Abstract

Authors of scholarly publications state their affiliations in various forms. This heterogeneity makes bibliographic analysis of institutions impossible unless the affiliation data is comprehensively cleaned and consolidated. We investigate automatic approaches to consolidating affiliation data in order to reduce manual work and support the scalability of affiliation analysis. In particular, we propose setting up a reference database of affiliation strings found in publications. A key step in this task is matching affiliation strings, that is, deciding whether two different strings refer to the same affiliation. For affiliation matching, we investigate web-based similarity measures that utilize the cognitive power of current search engines. These measures determine the similarity of affiliations from how strongly the URLs in the result sets of affiliation web searches overlap. We evaluate the effectiveness of affiliation matching based on URL overlap alone and in combination with the Soft TF-IDF similarity measure.
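The URL-overlap idea can be illustrated with a minimal sketch. The abstract does not state how the overlap is computed, so everything specific here is an assumption: the Jaccard coefficient as the overlap score, the reduction of each URL to its host name, the `top_k` cutoff, the `threshold` value, and the function names `normalize_url`, `url_overlap`, and `is_match`. The search itself is left out; the functions take the result URLs of two affiliation searches as plain lists. The combination with Soft TF-IDF mentioned in the abstract is not covered.

```python
from urllib.parse import urlsplit


def normalize_url(url: str) -> str:
    """Reduce a URL to its lowercase host without a leading 'www.'.

    This normalization is an assumption made for the sketch. URLs without
    a scheme (e.g. 'example.org/x') yield an empty host with urlsplit.
    """
    host = urlsplit(url).netloc.lower()
    return host[4:] if host.startswith("www.") else host


def url_overlap(results_a, results_b, top_k=10):
    """Jaccard overlap of the hosts in the top_k result URLs of two searches.

    Jaccard is an assumed choice; the abstract only speaks of 'overlap'.
    """
    set_a = {normalize_url(u) for u in results_a[:top_k]}
    set_b = {normalize_url(u) for u in results_b[:top_k]}
    if not set_a or not set_b:
        return 0.0  # no evidence from an empty result set
    return len(set_a & set_b) / len(set_a | set_b)


def is_match(results_a, results_b, threshold=0.3):
    """Treat two affiliation strings as matching above a hypothetical threshold."""
    return url_overlap(results_a, results_b) >= threshold
```

In use, each affiliation string would be sent to a search engine as a query and the returned URLs passed to `url_overlap`. Comparing hosts rather than full URLs is a design choice that tolerates searches landing on different pages of the same institutional site, at the cost of conflating institutions that share a host.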


Similar resources

A Procedure for Web Service Selection Using WS-Policy Semantic Matching

In general, policy-based approaches play an important role in the management of web services, for instance in the choice of semantic web services and, in particular, quality of service (QoS). The present research work illustrates a procedure for web service selection among functionally similar web services based on WS-Policy semantic matching. In this study, the procedure of WS-Policy publi...


Centralized Clustering Method To Increase Accuracy In Ontology Matching Systems

Ontologies are the main infrastructure of the Semantic Web, providing facilities for the integration, searching, and sharing of information on the web. The development of ontologies as the basis of the Semantic Web, together with their heterogeneity, has led to the emergence of ontology matching. With the emergence of large-scale ontologies in real domains, ontology matching systems face problems such as memory con...


Automated Document Labeling

An increasing number of publishers are using the Internet and the World Wide Web to provide their subscribers with access to online journals. New techniques are needed to capture, classify, analyze, extract, modify, and reformat Web-based document information for computer storage, access, and processing. An R&D division of the National Library of Medicine (NLM) is developing an automated system...


Using Generalized Language Model for Question Matching

Question answering is one of the popular services on the World Wide Web. The main goal of these services is to find the best answer to a user's input question as quickly as possible. To achieve this aim, most of them use new techniques for question matching. There are many question answering services on the Persian web, so it seems that developing a question matching m...


Dealing with Semantic Heterogeneity Issues on the Web

The Semantic Web is an extension of the traditional Web in which the meaning of information is well defined, thus allowing a better interaction between people and computers. To accomplish its goals, mechanisms are required to make explicit the semantics of Web resources, to be automatically processed by software agents (this semantics being described by means of online ontologies). Neverth...




Publication date: 2009